Microarchitectural-level simulator for parallel tile rendering on mobile GPUs

Author: Tomás Berjaga Aurora
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/07/2022

Mobile devices have led the boom in the technology sector in recent years. They have seen tremendous improvements in screen resolution and graphics quality, driven by growing demand for games and other animated graphics applications. However, rendering more realistic scenes requires significantly more computation and memory bandwidth, which inevitably increases energy consumption. Since mobile GPUs run on battery power, energy efficiency is an important design factor, as it dictates the device's battery life. In this work, we present a novel technique, which we term Parallel Tile Rendering (PTR), that aims to exploit new sources of parallelism in a GPU. Under PTR, multiple tiles are rasterized in parallel using two rasterization lanes, called Raster Units, in mobile GPU architectures. This dramatically reduces the cycles required for rasterization, which has been observed to be the most time-consuming stage when rendering images. Experimental results show that PTR achieves an average speedup of 83% across a wide range of benchmarks with different characteristics. It is also more effective than placing the same amount of computing resources in a single Raster Unit, with a performance increase of 8.3% on average. Moreover, PTR provides significant energy savings, with an average decrease of 9.86%.
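
The idea of rasterizing tiles on two Raster Units in parallel can be illustrated with a toy cycle count. This is only a sketch under stated assumptions, not the simulator described in the thesis: the function name `raster_cycles`, the per-tile cycle costs, and the rule that each free unit simply takes the next tile in order are all hypothetical, and the model ignores memory bandwidth and every other pipeline stage.

```python
import heapq


def raster_cycles(tile_costs, num_units):
    """Cycles needed to rasterize all tiles in a toy model (hypothetical).

    Assumption: tiles are handed out in order, and each tile goes to
    whichever raster unit becomes free first.
    """
    if num_units < 1:
        raise ValueError("num_units must be at least 1")
    # Each heap entry is the cycle at which one raster unit becomes free.
    free_at = [0] * num_units
    heapq.heapify(free_at)
    finish = 0
    for cost in tile_costs:
        start = heapq.heappop(free_at)   # earliest available unit
        end = start + cost
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


# Hypothetical per-tile rasterization costs, in cycles.
tile_costs = [40, 10, 30, 20, 50, 10]
single = raster_cycles(tile_costs, 1)
dual = raster_cycles(tile_costs, 2)
print(f"one unit: {single} cycles, two units: {dual} cycles")
```

Because tile costs are uneven, the two-unit total in such a model is generally more than half the single-unit total, which is one reason measured speedups fall short of 2x.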

UPCommons. Portal del coneixement obert de la UPC

Gestión de límites de energía globales para centros HPC

Author: Tomás Berjaga Aurora
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/07/2020

Energy management has become a fundamental research topic in HPC, since supercomputing centers are large energy consumers, which poses both an ecological and an economic challenge. Energy management is not trivial, however, because it depends on factors beyond the architecture being used. Software that handles this management is therefore needed. This project was developed within the context of Energy Aware Runtime (EAR), a software package that manages energy in HPC centers. EAR consists of several coordinated components, including the Global Manager, which is the core of this work. The Global Manager is responsible for guaranteeing that a system's energy consumption over a period of time does not exceed an established energy limit, a functionality called energy capping. The component can operate in two modes: manual and automatic. In manual mode it only monitors the system's energy consumption and leaves it to the system administrator to take whatever actions they consider appropriate to meet the limits. In automatic mode, in addition to monitoring, it can adapt the system configuration dynamically. Manual mode is fully implemented and in production. For automatic mode, by contrast, only an initial design exists and it has not been evaluated, so the objective of this work is to enable the Global Manager to operate automatically, making it more flexible. To that end, the initial state was evaluated and, based on the results, the API was extended with improved functionality and the Global Manager's dynamic adjustments were optimized for faster recovery.
A set of experiments was designed to evaluate the component extensively with all the optimizations, using up to 26 nodes (1024 cores), approximately 10% of the resources of the machine on which it was evaluated. Finally, the experiments show that the new version adapts more quickly to variations in workload while ensuring that the limits are not exceeded.
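
The difference between the manual and automatic modes can be sketched as a single budget check. This is a minimal illustration, not EAR's API or policy: the function `check_energy_budget`, its parameters, the returned action strings, and the linear pro-rating of the budget over the period are all assumptions made for the example.

```python
def check_energy_budget(consumed_j, budget_j, elapsed_s, period_s, automatic):
    """Decide what an energy-capping manager might do (hypothetical sketch).

    Assumption: the period's budget is pro-rated linearly, so after half
    the period at most half the budget should have been consumed.
    """
    if not 0 < elapsed_s <= period_s:
        raise ValueError("elapsed_s must be in (0, period_s]")
    allowed_so_far = budget_j * elapsed_s / period_s
    if consumed_j <= allowed_so_far:
        return "ok"
    # Manual mode only reports; automatic mode also reconfigures the system.
    return "reduce_power" if automatic else "warn_admin"


# Halfway through a 100 s period with a 1000 J budget, 600 J already used.
print(check_energy_budget(600, 1000, 50, 100, automatic=False))
print(check_energy_budget(600, 1000, 50, 100, automatic=True))
```

In manual mode the overshoot is only surfaced to the administrator, while in automatic mode the same condition triggers a configuration change; how quickly the system recovers afterwards is what the thesis sets out to optimize.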

UPCommons. Portal del coneixement obert de la UPC